29 min read
Intelligence Cartography The environment is no longer a passive test harness. It is a data engine. 10 dimensions of scaling, from task generation to multi-agent self-play.
The environment is no longer a passive test harness. It is a data engine. 10 dimensions of scaling, from task generation to multi-agent self-play.
Re-examining mid-training as the strategic centerpiece of the LLM pipeline — how it builds the knowledge foundation for agentic RL, and how RL signals are now flowing backward to improve mid-training itself
A Step-by-Step Walkthrough using Slime and SWE-Bench as an Example
From GPU Memory Budgets to Framework Architectures